Greedy Model Averaging

Authors

  • Dong Dai
  • Tong Zhang
Abstract

This paper considers the problem of combining multiple models to achieve a prediction accuracy not much worse than that of the best single model for least squares regression. It is known that if the models are mis-specified, model averaging is superior to model selection. Specifically, if n is the sample size, the worst-case regret of the former decays at the rate of O(1/n), while the worst-case regret of the latter decays at the rate of O(1/√n). In the literature, the most important and widely studied model averaging method that achieves the optimal O(1/n) average regret is the exponential weighted model averaging (EWMA) algorithm. However, this method suffers from several limitations. The purpose of this paper is to present a new greedy model averaging procedure that improves on EWMA. We prove strong theoretical guarantees for the new procedure and illustrate our theoretical results with empirical examples.
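
For readers unfamiliar with the EWMA baseline named in the abstract, the following is a minimal sketch of exponential weighting over a finite set of models, not the greedy procedure proposed in the paper. It assumes a uniform prior over models and a hypothetical `temperature` parameter; the function names and the data layout are illustrative choices, not taken from the paper.

```python
import math

def exponential_weights(preds, y, temperature=1.0):
    """Return one weight per model, proportional to exp(-SSE / temperature).

    preds[j][i] is model j's prediction for observation i; y[i] is the target.
    A uniform prior over models is assumed (an illustrative choice).
    """
    sse = [sum((p - t) ** 2 for p, t in zip(model, y)) for model in preds]
    best = min(sse)  # subtract the smallest error for numerical stability
    unnorm = [math.exp(-(s - best) / temperature) for s in sse]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def averaged_prediction(weights, new_preds):
    """Combine the models' predictions at a new point using the weights."""
    return sum(w * p for w, p in zip(weights, new_preds))

# Usage: two candidate models scored on three observations.
y = [1.0, 2.0, 3.0]
preds = [[1.0, 2.0, 3.0],   # model 0 fits exactly
         [2.0, 3.0, 4.0]]   # model 1 is off by one everywhere
weights = exponential_weights(preds, y)
combined = averaged_prediction(weights, [4.0, 5.0])
```

Model selection would keep only the single best-scoring model; averaging keeps every model and lets the weights concentrate on the better ones as the errors separate.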

Similar resources

LSBN: A Large-Scale Bayesian Structure Learning Framework for Model Averaging

The motivation for this paper is to apply Bayesian structure learning using model averaging in large-scale networks. Currently, the Bayesian model averaging algorithm is applicable only to networks with tens of variables, restrained by its super-exponential complexity. We present a novel framework, called LSBN (Large-Scale Bayesian Network), making it possible to handle networks with infinite size b...

Bayesian Additive Regression Trees using Bayesian model averaging

Bayesian Additive Regression Trees (BART) is a statistical sum-of-trees model. It can be considered a Bayesian version of machine learning tree ensemble methods where the individual trees are the base learners. However, for datasets where the number of variables p is large (e.g. p > 5,000), the algorithm can become prohibitively expensive computationally. Another method which is popular for hig...

Using Greedy Clustering Method to Solve Capacitated Location-Routing Problem with Fuzzy Demands

In this paper, the capacitated location routing problem with fuzzy demands (CLRP_FD) is considered. In CLRP_FD, facility location problem (FLP) and vehicle routing problem (VRP) are observed simultaneously. Indeed the vehicles and the depots have a predefined capacity to serve the customerst...

Bayesian Model Averaging Naive Bayes (BMA-NB): Averaging over an Exponential Number of Feature Models in Linear Time

Naive Bayes (NB) is well known to be a simple but effective classifier, especially when combined with feature selection. Unfortunately, feature selection methods are often greedy and thus cannot guarantee that an optimal feature set is selected. An alternative to feature selection is to use Bayesian model averaging (BMA), which computes a weighted average over multiple predictors; when the different...

On The Recovery Of The Consecutive Ones Property By Generalized Reciprocal Averaging Algorithms

In this note we present a general proof of a phenomenon demonstrated in Heiser (1981): if the columns of a (0,1)-table can be permuted such that the 1s in each row form a consecutive interval, then the correct order of the columns can be found using the reciprocal averaging algorithm. Several variants of the reciprocal averaging algorithm for which the same property can be proved are considered.

Publication date: 2011